Since the identification of SARS-CoV-2 in Wuhan, China, in January 2020
(1), the origin of the virus has been a topic of intense scientific debate and
public speculation. The two main hypotheses are that the virus emerged
from human exposure to an infected animal [“zoonosis” (2)] or that it
emerged in a research-related incident (3). The investigation into the origin
of the virus has been made difficult by the lack of key evidence from the
earliest days of the outbreak; there is no doubt that greater transparency on
the part of Chinese authorities would be enormously helpful. Nevertheless,
we argue here that there is much important information that can be gleaned
from US-based research institutions, information not yet made available for
independent, transparent, and scientific scrutiny.

The data available within the United States would explicitly include, but
are not limited to, viral sequences gathered and held as part of the PREDICT
project and other funded programs, as well as sequencing data and
laboratory notebooks from US laboratories. We call on US government scientific
agencies, most notably the NIH, to support a full, independent, and
transparent investigation of the origins of SARS-CoV-2. This should take place,
for example, within a tightly focused science-based bipartisan Congressional
inquiry with full investigative powers, which would be able to ask important
questions but avoid misguided witch hunts governed more by politics than
by science.
When it comes to deciphering the origins of COVID-19, much important
information can be gleaned from US-based research institutions,
information that has yet to be made available for independent,
transparent, and scientific scrutiny. Image credit: Dave Cutler (artist).
Essential US Investigations


The US intelligence community (IC) was tasked, in 2021 by
President Joe Biden (4), with investigating the origin of the
virus. In their summary public statement, the IC writes that
“all agencies assess that two hypotheses are plausible:
natural exposure to an infected animal and a
laboratory-associated incident” (4). The IC further writes that
“China's cooperation most likely would be needed to reach a
conclusive assessment of the origins of COVID-19 [coronavirus
disease 2019].” Of course, such cooperation is highly
warranted and should be pursued by the US Government
and the US scientific community. Yet, as outlined below,
much could be learned by investigating US-supported and
US-based work that was underway in collaboration with
Wuhan-based institutions, including the Wuhan Institute of
Virology (WIV), China. It is still not clear whether the IC
investigated these US-supported and US-based activities. If
it did, it has yet to make any of its findings available to the
US scientific community for independent and transparent
analysis and assessment. If, on the other hand, the IC
did not investigate these US-supported and US-based
activities, then it has fallen far short of conducting a
comprehensive investigation.

This lack of an independent and transparent US-based
scientific investigation has had four highly adverse
consequences. First, public trust in the ability of US scientific
institutions to govern the activities of US science in a
responsible manner has been shaken. Second, the
investigation of the origin of SARS-CoV-2 has become politicized
within the US Congress (5); as a result, the inception of an
independent and transparent investigation has been
obstructed and delayed. Third, US researchers with deep
knowledge of the possibilities of a laboratory-associated
incident have not been enabled to share their expertise
effectively. Fourth, the failure of NIH, one of the main
funders of the US-China collaborative work, to facilitate the
investigation into the origins of SARS-CoV-2 (4) has
fostered distrust regarding US biodefense research activities.

Much of the work on SARS-like CoVs performed in Wuhan
was part of an active and highly collaborative US-China
scientific research program funded by the US Government
(NIH, Defense Threat Reduction Agency [DTRA], and US
Agency for International Development [USAID]), coordinated
by researchers at EcoHealth Alliance (EHA), but involving
researchers at several other US institutions. For this reason,
it is important that US institutions be transparent about any
knowledge of the detailed activities that were underway in
Wuhan and in the United States. The evidence may also
suggest that research institutions in other countries were
involved, and those too should be asked to submit relevant
information (e.g., with respect to unpublished sequences).

Participating US institutions include the EHA, the
University of North Carolina (UNC), the University of California at
Davis (UCD), the NIH, and the USAID. Under a series of NIH
grants and USAID contracts, EHA coordinated the
collection of SARS-like bat CoVs from the field in southwest
China and southeast Asia, the sequencing of these viruses,
the archiving of these sequences (involving UCD), and the
analysis and manipulation of these viruses (notably at
UNC). A broad spectrum of coronavirus research work was
done not only in Wuhan (including groups at Wuhan
University and the Wuhan CDC, as well as WIV) but also in the
United States. The exact details of the fieldwork and
laboratory work of the EHA-WIV-UNC partnership, and the
engagement of other institutions in the United States and
China, have not been disclosed for independent analysis.
The precise nature of the experiments that were
conducted, including the full array of viruses collected from
the field and the subsequent sequencing and manipulation
of those viruses, remains unknown.

EHA, UNC, NIH, USAID, and other research partners
have failed to disclose their activities to the US scientific
community and the US public, instead declaring that they
were not involved in any experiments that could have
resulted in the emergence of SARS-CoV-2. The NIH has
specifically stated (6) that there is a significant evolutionary
distance between the published viral sequences and that
of SARS-CoV-2 and that the pandemic virus could not have
resulted from the work sponsored by NIH. Of course, this
statement is only as good as the limited data on which it is
based, and verification of this claim is dependent on
gaining access to any other unpublished viral sequences that
are deposited in relevant US and Chinese databases (7, 8).
On May 11, 2022, Acting NIH Director Lawrence Tabak
testified before Congress that several such sequences in a US
database were removed from public view, and that this
was done at the request of both Chinese and US
investigators.

Blanket denials from the NIH are no longer good enough.
Although the NIH and USAID have strenuously resisted full
disclosure of the details of the EHA-WIV-UNC work program,
several documents leaked to the public or released through
the Freedom of Information Act (FOIA) have raised concerns.
These research proposals make clear that the EHA-WIV-UNC
collaboration was involved in the collection of a large number
of so-far undocumented SARS-like viruses and was engaged
in their manipulation within biological safety level (BSL)-2 and
BSL-3 laboratory facilities, raising concerns that an airborne
virus might have infected a laboratory worker (9). A variety of
scenarios have been discussed by others, including an
infection that involved a natural virus collected from the field or
perhaps an engineered virus manipulated in one of the
laboratories (3).


Overlooked Details 


Special concerns surround the presence of an unusual
furin cleavage site (FCS) in SARS-CoV-2 (10) that augments
the pathogenicity and transmissibility of the virus relative
to related viruses like SARS-CoV-1 (11, 12). SARS-CoV-2 is,
to date, the only identified member of the subgenus
sarbecovirus that contains an FCS, although these are
present in other coronaviruses (13, 14). A portion of the
sequence of the spike protein of some of these viruses is
shown in the alignment in Fig. 1, illustrating the
unusual nature of the FCS and its apparent insertion in
SARS-CoV-2 (15). From the first weeks after the genome
sequence of SARS-CoV-2 became available, researchers
have commented on the unexpected presence of the FCS
within SARS-CoV-2, the implication being that SARS-CoV-2
might be a product of laboratory manipulation. In a review
piece arguing against this possibility, it was asserted that
the amino acid sequence of the FCS in SARS-CoV-2 is an
unusual, nonstandard sequence for an FCS and that
nobody in a laboratory would design such a novel FCS (13).

Fig. 1. This alignment of the amino acid sequences of coronavirus spike proteins, in the region of the S1/S2 junction, illustrates the sequence of
SARS-CoV-2 (Wuhan-Hu-1) and some of its closest relatives. The furin cleavage site (FCS) is indicated (PRRAR'SVAS), and furin cuts the spike protein
between R and S, as indicated by the red arrowhead. Adapted from Chan & Zhan (15).
In fact, the assertion that the FCS in SARS-CoV-2 has an
unusual, nonstandard amino acid sequence is false. The
amino acid sequence of the FCS in SARS-CoV-2 also exists
in the human ENaC α subunit (16), where it is known to be
functional and has been extensively studied (17, 18). The
FCS of human ENaC α has the amino acid sequence
RRAR'SVAS (Fig. 2), an eight-amino-acid sequence that is
identical to the FCS of SARS-CoV-2 (16). ENaC is
an epithelial sodium channel, expressed on the apical
surface of epithelial cells in the kidney, colon, and airways
(19, 20), that plays a critical role in controlling fluid
exchange. The ENaC α subunit has a functional FCS (17, 18)
that is essential for ion channel function (19) and has been
characterized in a variety of species. The FCS sequence of
human ENaC α (20) is identical in chimpanzee, bonobo,
orangutan, and gorilla (SI Appendix, Fig. 1), but diverges in all
other species, even primates, except one. (The one
non-human non-great ape species with the same sequence is
Pipistrellus kuhlii, a bat species found in Europe and Western
Asia; other bat species, including Rhinolophus ferrumequinum,
have a different FCS sequence in ENaC α [RKAR'SAAS].)

Fig. 2. Amino acid alignment of the furin cleavage sites of SARS-CoV-2
spike protein with (Top) the spike proteins of other viruses that lack
the furin cleavage site and (Bottom) the furin cleavage sites present in the
α subunits of human and mouse ENaC. Adapted from Anand et al. (16).
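
The residue-by-residue comparison behind this identity claim can be sketched in a few lines of Python. This is only an illustrative sketch: the three eight-residue motifs are those quoted in the text (with the apostrophe marking the cleavage position dropped), while the constant names and the helper `mismatches` are hypothetical and are not taken from any cited work.

```python
# Eight-residue furin cleavage site (FCS) motifs as quoted in the text.
# The apostrophe used in the article (e.g. RRAR'SVAS) marks the cut
# position and is omitted here. All names below are illustrative.
SARS_COV_2_SPIKE_FCS = "RRARSVAS"
HUMAN_ENAC_ALPHA_FCS = "RRARSVAS"
R_FERRUMEQUINUM_ENAC_ALPHA_FCS = "RKARSAAS"


def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in a, residue in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    # No differences: the spike motif and the human ENaC alpha motif coincide.
    print(mismatches(SARS_COV_2_SPIKE_FCS, HUMAN_ENAC_ALPHA_FCS))
    # The Rhinolophus ferrumequinum ENaC alpha motif differs at two positions.
    print(mismatches(SARS_COV_2_SPIKE_FCS, R_FERRUMEQUINUM_ENAC_ALPHA_FCS))
```

A simple position-wise comparison suffices here because the motifs have equal length and are already aligned; it says nothing about how likely such a match is to arise by chance.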

One consequence of this “molecular mimicry” between
the FCS of SARS-CoV-2 spike and the FCS of human ENaC is
competition for host furin in the lumen of the Golgi
apparatus, where the SARS-CoV-2 spike is processed. This
results in a decrease in human ENaC expression (21). A
decrease in human ENaC expression compromises airway
function and has been implicated as a contributing factor
in the pathogenesis of COVID-19 (22). Another
consequence of this astonishing molecular mimicry is evidenced
by apparent cross-reactivity with human ENaC of
antibodies from COVID-19 patients, with the highest levels of
cross-reacting antibodies directed against this epitope
being associated with the most severe disease (23).

We do not know whether the insertion of the FCS was
the result of natural evolution (2, 13), perhaps via a
recombination event in an intermediate mammal or a
human (13, 24), or was the result of a deliberate
introduction of the FCS into a SARS-like virus as part of a laboratory
experiment. We do know that the insertion of such FCS
sequences into SARS-like viruses was a specific goal of
work proposed by the EHA-WIV-UNC partnership within a
2018 grant proposal (“DEFUSE”) that was submitted to the
US Defense Advanced Research Projects Agency (DARPA)
(25). The 2018 proposal to DARPA was not funded, but we
do not know whether some of the proposed work was
subsequently carried out in 2018 or 2019, perhaps using
another source of funding.

We also know that this research team would be
familiar with several previous experiments involving the
successful insertion of an FCS sequence into SARS-CoV-1
(26) and other coronaviruses, and that they had extensive
experience in the construction of chimeric SARS-like viruses (27-29).
In addition, the research team would also have some
familiarity with the FCS sequence and the FCS-dependent
activation mechanism of human ENaC α (19), which was
extensively characterized at UNC (17, 18). For a research
team assessing the pandemic potential of SARS-related
coronaviruses, the FCS of human ENaC (an FCS known to
be efficiently cleaved by host furin present in the target
location [epithelial cells] of an important target organ
[lung] of the target organism [human]) might be a
rational, if not obvious, choice of FCS to introduce into a virus
to alter its infectivity, in line with other work performed
previously.

Of course, the molecular mimicry of ENaC within the
SARS-CoV-2 spike protein might be a mere coincidence,
although one with a very low probability. The exact FCS
sequence present in SARS-CoV-2 has recently been
introduced into the spike protein of SARS-CoV-1 in the
laboratory, in an elegant series of experiments (12, 30), with
predictable consequences in terms of enhanced viral
transmissibility and pathogenicity. Obviously, the creation
of such SARS-1/2 “chimeras” is an area of some concern
for those responsible for present and future regulation of
this area of biology. [Note that these experiments in ref.
30 were done in the context of a safe “pseudotyped” virus
and thus posed no danger of producing or releasing a
novel pathogen.] These simple experiments show that the
introduction of the 12 nucleotides that constitute the FCS
insertion in SARS-CoV-2 would not be difficult to achieve in
a lab. It would therefore seem reasonable to ask that
electronic communications and other relevant data from US
groups should be made available for scrutiny.


Seeking Transparency 


To date, the federal government, including the NIH, has
not done enough to promote public trust and
transparency in the science surrounding SARS-CoV-2. A steady
trickle of disquieting information has cast a darkening
cloud over the agency. The NIH could say more about the
possible role of its grantees in the emergence of
SARS-CoV-2, yet the agency has failed to reveal to the public the
possibility that SARS-CoV-2 emerged from a
research-associated event, even though several researchers raised
that concern on February 1, 2020, in a phone conversation
that was documented by email (5). Those emails were
released to the public only through FOIA, and they suggest
that the NIH leadership took an early and active role in
promoting the “zoonotic hypothesis” and the rejection of
the laboratory-associated hypothesis (5). The NIH has
resisted the release of important evidence, such as the
grant proposals and project reports of EHA, and has
continued to redact materials released under FOIA, including a
remarkable 290-page redaction in a recent FOIA release.

pnas.org

Downloaded from https://www.pnas.org by 50.117.77.36 on July 12, 2022 from IP address 50.117.77.36.

Information now held by the research team headed by
EHA (7), as well as the communications of that research
team with US research funding agencies, including NIH,
USAID, DARPA, DTRA, and the Department of Homeland
Security, could shed considerable light on the experiments
undertaken by the US-funded research team and on the
possible relationship, if any, between those experiments
and the emergence of SARS-CoV-2. We do not assert that
laboratory manipulation was involved in the emergence of
SARS-CoV-2, although it is apparent that it could have
been. However, we do assert that there has been no
independent and transparent scientific scrutiny to date of the
full scope of the US-based evidence.

The relevant US-based evidence would include the
following information: laboratory notebooks, virus databases,
electronic media (emails, other communications),
biological samples, viral sequences gathered and held as part of
the PREDICT project (7) and other funded programs, and
interviews of the EHA-led research team by independent
researchers, together with a full record of US agency
involvement in funding the research on SARS-like viruses,
especially with regard to projects in collaboration with
Wuhan-based institutions. We suggest that a bipartisan
inquiry should also follow up on the tentative conclusion
of the IC (4) that the initial outbreak in Wuhan may have
occurred no later than November 2019 and that therefore
the virus was circulating before the cluster of known
clinical cases in December. The IC did not reveal the evidence
for this statement, nor when parts of the US Government
or US-based researchers first became aware of a potential
new outbreak. Any available information and knowledge of
the earliest days of the outbreak, including viral sequences
(8), could shed considerable light on the origins question.

We continue to recognize the tremendous value of
US-China cooperation in ongoing efforts to uncover the
proximal origins of the pandemic. Much vital information
still resides in China, in the laboratories, hospital samples,
and early epidemiological information not yet available to
the scientific community. Yet a US-based investigation
need not wait: there is much to learn from the US
institutions that were extensively involved in research that may
have contributed to, or documented the emergence of, the
SARS-CoV-2 virus. Only an independent and transparent
investigation, perhaps as a bipartisan Congressional inquiry,
will reveal the information that is needed to enable a
thorough scientific process of scrutiny and evaluation.